Search | Biblioteca Virtual em Saúde (Virtual Health Library)

Bioinformatics pipeline for the systematic mining genomic and proteomic variation linked to rare diseases: The example of monogenic diabetes.

Kuznetsova, Ksenia G; Vasícek, Jakub; Skiadopoulou, Dafni; Molnes, Janne; Udler, Miriam; Johansson, Stefan; Njølstad, Pål Rasmus; Manning, Alisa; Vaudel, Marc.

PLoS One ; 19(4): e0300350, 2024.

Article in English | MEDLINE | ID: mdl-38635808

ABSTRACT

Monogenic diabetes is characterized as a group of diseases caused by rare variants in single genes. Like for other rare diseases, multiple genes have been linked to monogenic diabetes with different measures of pathogenicity, but the information on the genes and variants is not unified among different resources, making it challenging to process them informatically. We have developed an automated pipeline for collecting and harmonizing data on genetic variants linked to monogenic diabetes. Furthermore, we have translated variant genetic sequences into protein sequences accounting for all protein isoforms and their variants. This allows researchers to consolidate information on variant genes and proteins linked to monogenic diabetes and facilitates their study using proteomics or structural biology. Our open and flexible implementation using Jupyter notebooks enables tailoring and modifying the pipeline and its application to other rare diseases.

Subjects

Diabetes Mellitus, Proteomics, Humans, Rare Diseases/genetics, Genomics, Computational Biology, Diabetes Mellitus/genetics
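
The translation step mentioned in the abstract above (variant genetic sequence to variant protein sequence) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `translate` and `apply_snv`, the toy coding sequence, and the restriction to one transcript and single-nucleotide substitutions are all assumptions made for illustration. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def apply_snv(cds: str, pos: int, ref: str, alt: str) -> str:
    """Apply a single-nucleotide substitution at a 1-based CDS position.

    Hypothetical helper: checks that the reference base matches before
    substituting, so mismatched coordinates fail loudly.
    """
    if cds[pos - 1] != ref:
        raise ValueError(f"reference mismatch at {pos}: found {cds[pos - 1]}, expected {ref}")
    return cds[:pos - 1] + alt + cds[pos:]


# Toy sequence (not a real gene): ATG GCT GAA TGG TAA
reference_cds = "ATGGCTGAATGGTAA"
variant_cds = apply_snv(reference_cds, 7, "G", "A")  # GAA -> AAA

print(translate(reference_cds))
print(translate(variant_cds))
```

A real pipeline would repeat this per transcript so that every protein isoform gets its own variant sequence, and would also need to handle insertions, deletions, and strand orientation, none of which this sketch covers.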

Retention Time and Fragmentation Predictors Increase Confidence in Identification of Common Variant Peptides.

Skiadopoulou, Dafni; Vasícek, Jakub; Kuznetsova, Ksenia; Bouyssié, David; Käll, Lukas; Vaudel, Marc.

J Proteome Res ; 22(10): 3190-3199, 2023 Oct 06.

Article in English | MEDLINE | ID: mdl-37656829

ABSTRACT

Precision medicine focuses on adapting care to the individual profile of patients, for example, accounting for their unique genetic makeup. Being able to account for the effect of genetic variation on the proteome holds great promise toward this goal. However, identifying the protein products of genetic variation using mass spectrometry has proven very challenging. Here we show that the identification of variant peptides can be improved by the integration of retention time and fragmentation predictors into a unified proteogenomic pipeline. By combining these intrinsic peptide characteristics using the search-engine post-processor Percolator, we demonstrate improved discrimination power between correct and incorrect peptide-spectrum matches. Our results demonstrate that the drop in performance that is induced when expanding a protein sequence database can be compensated, hence enabling efficient identification of genetic variation products in proteomics data. We anticipate that this enhancement of proteogenomic pipelines can provide a more refined picture of the unique proteome of patients and thereby contribute to improving patient care.

Finding haplotypic signatures in proteins.

Vasícek, Jakub; Skiadopoulou, Dafni; Kuznetsova, Ksenia G; Wen, Bo; Johansson, Stefan; Njølstad, Pål R; Bruckner, Stefan; Käll, Lukas; Vaudel, Marc.

Gigascience ; 12, 2022 Dec 28.

Article in English | MEDLINE | ID: mdl-37919975

ABSTRACT

BACKGROUND: The nonrandom distribution of alleles of common genomic variants produces haplotypes, which are fundamental in medical and population genetic studies. Consequently, protein-coding genes with different co-occurring sets of alleles can encode different amino acid sequences: protein haplotypes. These protein haplotypes are present in biological samples and detectable by mass spectrometry, but they are not accounted for in proteomic searches. Consequently, the impact of haplotypic variation on the results of proteomic searches and the discoverability of peptides specific to haplotypes remain unknown. FINDINGS: Here, we study how common genetic haplotypes influence the proteomic search space and investigate the possibility to match peptides containing multiple amino acid substitutions to a publicly available data set of mass spectra. We found that for 12.42% of the discoverable amino acid substitutions encoded by common haplotypes, 2 or more substitutions may co-occur in the same peptide after tryptic digestion of the protein haplotypes. We identified 352 spectra that matched to such multivariant peptides, and out of the 4,582 amino acid substitutions identified, 6.37% were covered by multivariant peptides. However, the evaluation of the reliability of these matches remains challenging, suggesting that refined error rate estimation procedures are needed for such complex proteomic searches. CONCLUSIONS: As these procedures become available and the ability to analyze protein haplotypes increases, we anticipate that proteomics will provide new information on the consequences of common variation, across tissues and time.

Subjects

Proteins, Proteomics, Proteomics/methods, Haplotypes, Reproducibility of Results, Proteins/genetics, Peptides
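
The abstract above notes that two or more amino acid substitutions may co-occur in the same peptide after tryptic digestion. The sketch below shows one way to check this in silico. It is an illustration, not the authors' method: the function names, the toy protein, and the substitution positions are hypothetical, and it assumes the common trypsin rule (cleave after K or R unless the next residue is P) with no missed cleavages.

```python
def tryptic_peptides(protein: str) -> list[tuple[int, int, str]]:
    """Fully digest a protein; return (start, end, sequence) with 1-based inclusive coordinates."""
    peptides = []
    start = 0
    for i, aa in enumerate(protein):
        is_last = i == len(protein) - 1
        # Cleave after K/R unless followed by P; always close the final peptide.
        if is_last or (aa in "KR" and protein[i + 1] != "P"):
            peptides.append((start + 1, i + 1, protein[start:i + 1]))
            start = i + 1
    return peptides


def multivariant_peptides(protein: str, substitution_positions: list[int]) -> list[tuple[str, list[int]]]:
    """Return peptides that cover two or more of the given 1-based substitution positions."""
    hits = []
    for start, end, seq in tryptic_peptides(protein):
        covered = [p for p in substitution_positions if start <= p <= end]
        if len(covered) >= 2:
            hits.append((seq, covered))
    return hits


# Toy protein haplotype with hypothetical substitutions at positions 6, 8 and 15.
protein = "MAEKTLSDRPGHKVVAR"
print(tryptic_peptides(protein))
print(multivariant_peptides(protein, [6, 8, 15]))
```

Allowing missed cleavages would enlarge the set of peptides in which substitutions can co-occur, so a full digestion like this one gives a lower bound under the stated assumptions.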
